Discriminative bilinear language modeling for broadcast transcriptions
Abstract
This paper describes a discriminative bilinear language model (DBLM) estimated by Bayes risk minimization. A discriminative language model (DLM) is conventionally trained on n-gram features. Given a large amount of training data, however, the DLM is not necessarily trained efficiently because the number of unique features keeps growing. In addition, although some n-grams share the same word sequences as contexts, the DLM cannot exploit this information because its features are not designed to work in a coordinated manner. These disadvantages of n-gram features can reduce the robustness of the DLM. We address them by introducing a bilinear network structure over the features, which factorizes the contexts shared among the n-grams and allows the model to be estimated more robustly. In the proposed language model, all model parameters, such as the weight matrices, are estimated by minimizing a Bayes risk objective on the training lattices. Experimental results show that the DBLM trained in a lightly supervised manner significantly reduced the word error rate relative to the trigram LM, whereas the conventional DLM did not yield a significant reduction.
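To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes bigram contexts, a small N-best list in place of training lattices, and hand-picked toy embeddings. All names (`bilinear_score`, `hypothesis_score`, `expected_error`, `context_emb`, `word_emb`) are hypothetical. The bilinear form scores a word given its context through one shared weight matrix, and the Bayes risk is taken here as the expected number of word errors under a softmax posterior over the hypotheses.

```python
import math

def bilinear_score(context_vec, weight, word_vec):
    """Return context_vec^T * weight * word_vec using plain Python lists."""
    return sum(
        context_vec[i] * weight[i][j] * word_vec[j]
        for i in range(len(context_vec))
        for j in range(len(word_vec))
    )

def hypothesis_score(words, context_emb, word_emb, weight):
    """Sum the bilinear scores over all bigrams (previous word, current word)."""
    total = 0.0
    for prev, cur in zip(words, words[1:]):
        total += bilinear_score(context_emb[prev], weight, word_emb[cur])
    return total

def expected_error(scores, errors):
    """Expected error count under a softmax posterior over the hypotheses."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    return sum(e / norm * err for e, err in zip(exps, errors))

# Toy data (assumed for illustration): two hypotheses and their word-error counts.
hypotheses = [["the", "cat", "sat"], ["the", "cap", "sat"]]
errors = [0, 1]

# Hand-picked 2-dimensional embeddings; a real model would learn these.
context_emb = {"the": [1.0, 0.0], "cat": [1.0, 0.0], "cap": [0.0, 1.0]}
word_emb = {"cat": [1.0, 0.0], "cap": [0.0, 1.0], "sat": [1.0, 0.0]}

zero_weight = [[0.0, 0.0], [0.0, 0.0]]
identity_weight = [[1.0, 0.0], [0.0, 1.0]]

def risk(weight):
    """Bayes risk of the toy N-best list for a given weight matrix."""
    scores = [hypothesis_score(h, context_emb, word_emb, weight) for h in hypotheses]
    return expected_error(scores, errors)

risk_untrained = risk(zero_weight)      # all hypotheses equally likely
risk_informed = risk(identity_weight)   # weight matrix favors the correct hypothesis
```

In this toy setting, every bigram shares the single matrix `weight`, so evidence about one context transfers to other n-grams with similar context vectors. Training would adjust `weight` and the embeddings to lower the value returned by `risk`.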
Similar resources
Recovering capitalization and punctuation marks for automatic speech recognition: Case study for Portuguese broadcast news
The following material presents a study about recovering punctuation marks, and capitalization information from European Portuguese broadcast news speech transcriptions. Different approaches were tested for capitalization, both generative and discriminative, using: finite state transducers automatically built from language models; and maximum entropy models. Several resources were used, includi...
A Lightweight on-the-fly Capitalization System for Automatic Speech Recognition
This paper describes a lightweight method for capitalizing speech transcriptions. Several resources were used, including a lexicon, newspaper written corpora and speech transcriptions. Different approaches were tested both generative and discriminative: finite state transducers, automatically built from Language Models; and maximum entropy models. Evaluation results are presented both for writt...
Unsupervised training methods for discriminative language modeling
Discriminative language modeling (DLM) aims to choose the most accurate word sequence by reranking the alternatives output by the automatic speech recognizer (ASR). The conventional (supervised) way of training a DLM requires a large amount of acoustic recordings together with their manual reference transcriptions. These transcriptions are used to determine the target ranks of the ASR outputs, ...
Online Story Segmentation of Multilingual Streaming Broadcast News
We present an online story segmentation approach for Broadcast News (BN) that is built upon and integrated into BBN COTS multilingual Broadcast Monitoring System (BMS). We take a discriminative model-based approach, using Support Vector Machines to segment BN transcriptions into thematically coherent stories within the real-time constraints defined by BMS. We extract lexical, topical and story ...
A Decade of Discriminative Language Modeling for Automatic Speech Recognition
This paper summarizes the research on discriminative language modeling focusing on its application to automatic speech recognition (ASR). A discriminative language model (DLM) is typically a linear or log-linear model consisting of a weight vector associated with a feature vector representation of a sentence. This flexible representation can include linguistically and statistically motivated fe...